Improving the Use of Pseudo-Words for Evaluating Selectional Preferences

نویسندگان

  • Nathanael Chambers
  • Daniel Jurafsky
چکیده

This paper improves the use of pseudowords as an evaluation framework for selectional preferences. While pseudowords originally evaluated word sense disambiguation, they are now commonly used to evaluate selectional preferences. A selectional preference model ranks a set of possible arguments for a verb by their semantic fit to the verb. Pseudo-words serve as a proxy evaluation for these decisions. The evaluation takes an argument of a verb like drive (e.g. car), pairs it with an alternative word (e.g. car/rock), and asks a model to identify the original. This paper studies two main aspects of pseudoword creation that affect performance results. (1) Pseudo-word evaluations often evaluate only a subset of the words. We show that selectional preferences should instead be evaluated on the data in its entirety. (2) Different approaches to selecting partner words can produce overly optimistic evaluations. We offer suggestions to address these factors and present a simple baseline that outperforms the state-ofthe-art by 13% absolute on a newspaper domain.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Neural Network Approach to Selectional Preference Acquisition

This paper investigates the use of neural networks for the acquisition of selectional preferences. Inspired by recent advances of neural network models for nlp applications, we propose a neural network model that learns to discriminate between felicitous and infelicitous arguments for a particular predicate. The model is entirely unsupervised – preferences are learned from unannotated corpus da...

متن کامل

Disambiguating Noun and Verb Senses Using Automatically Acquired Selectional Preferences

Our system for the SENSEVAL-2 all words task uses automatically acquired selectional preferences to sense tag subject and object head nouns, along with the associated verbal predicates. The selectional preferences comprise probability distributions over WordN et nouns, and these distributions are conditioned on WordNet verb classes. The conditional distributions are used directly to disambiguat...

متن کامل

Inferring Selectional Preferences from Part-Of-Speech N-grams

We present the PONG method to compute selectional preferences using part-of-speech (POS) N-grams. From a corpus labeled with grammatical dependencies, PONG learns the distribution of word relations for each POS N-gram. From the much larger but unlabeled Google N-grams corpus, PONG learns the distribution of POS N-grams for a given pair of words. We derive the probability that one word has a giv...

متن کامل

The Impact of Selectional Preference Agreement on Semantic Relational Similarity

Relational similarity is essential to analogical reasoning. Automatically determining the degree to which a pair of words belongs to a semantic relation (relational similarity) is greatly improved by considering the selectional preferences of the relation. To determine selectional preferences, we induced semantic classes through a Latent Dirichlet Allocation (LDA) method that operates on depend...

متن کامل

A Random Walk Approach to Selectional Preferences Based on Preference Ranking and Propagation

This paper presents an unsupervised random walk approach to alleviate data sparsity for selectional preferences. Based on the measure of preferences between predicates and arguments, the model aggregates all the transitions from a given predicate to its nearby predicates, and propagates their argument preferences as the given predicate’s smoothed preferences. Experimental results show that this...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010